Abstract. This paper presents a model that is useful for developing resource allocation

1 Introduction

The use of distributed computing technology in real-time systems is increasing rapidly. For 
example, an important aspect of the NASA Earth Science vision is its sensor-web, an integrated, 
autonomous constellation of earth observing satellites that monitor the condition of the planet 
through a vast array of instruments. While this concept offers numerous benefits, including cost 
reduction and greater flexibility, its full potential cannot be realized with today’s information 
system technology. Common real-time engineering approaches use “worst-case” execution times 
(WCETs) to characterize task workloads a priori (e.g., see [15, 16]) and allocate computing and 
network resources to processes at design time. These approaches unnecessarily limit the 
functions that can be performed by spacecraft and limit the options that are available for 
handling unanticipated science events and anomalies, such as overloading of system resources. 
These limitations can mean loss of scientific data and missed opportunities for observing 
important terrestrial events. As noted in [7, 13, 19, 20], characterizing workloads of real-time 
systems using a priori worst-case execution times can lead to poor resource utilization, and is 
inappropriate for applications that must execute in highly dynamic environments. 

Adaptive resource management (ARM) middleware (software that resides between the 
computer’s operating system and the computer’s applications) can address this problem by 
dynamically reconfiguring the way in which computing and network resources are allocated to 
processes. In [15], we examined a command and control system in use by NASA, and explored
how the components of that system could be distributed across multiple processors in such a way 
that the system remained as robust as before, and at least as capable of meeting its real-time 
processing requirements. We found that many benefits would be realized by treating related 
systems as one unified system that shares a dynamically allocated pool of resources. In [16], we 
explored the possibilities of adaptive resource management for onboard satellite systems. 
Satellites are now sophisticated enough to have multiple onboard processors, yet they generally 
have processes statically assigned to each processor. Little, if any, provision to dynamically 
redistribute the processing load is provided. Onboard instruments are capable of collecting far 
more data than can be downloaded to the Earth, thus requiring idle times between downloads. 
Although download times are known a priori, failed downloads can cause the buffer on the 
satellite to overflow. To handle this situation, ARM middleware autonomously determines the 
following: the allocation of resources to tasks, the fidelity of data processing algorithms (such as 
a cloud cover detection algorithm [1]), the compression type to use on data, when and what to 
download, whether data should be discarded, and the interval for gathering telemetry data from 
various onboard subsystems. Decisions are made based on a system-level benefit optimization 
that takes into account observation schedules, future and current download opportunities, 
satellite health, user-defined benefit functions, and system resource utilization. 

To allow future research efforts in ARM to build upon the foundation that we have established, 
this paper presents our model of dynamic, distributed real-time systems. It also provides an 
algorithm that shows how to employ the model to perform adaptive resource management. In 
[22, 23] we presented static models for resource allocation of real-time systems, and in [24, 25] 
we presented dynamic models. Applications of our dynamic models [26, 27, 28] showed their 
effectiveness for adaptive resource management. However, our previous approaches lacked the 
information needed to gracefully degrade performance in overload situations, did not support 
feasibility analysis or allocation optimization, did not consider security aspects, and did not 
include network hardware. This paper removes those shortcomings by extending the model to 
incorporate knowledge of application profiles, network hardware, utility, and service level 
constructs. 

The remainder of the paper is organized as follows. Section 2 presents the model. In Section 2.1, 
the model of the hardware resources is presented. Section 2.2 describes the model of the software 
system, which consists of subsystems, end-to-end paths, and applications (tasks). Section 3 
shows how to use the model to check global allocation constraints and to perform global 
allocation optimization. A detailed framework for developing allocation algorithms based on the 
model is provided in Section 4. An overview of related research is provided in Section 5. 

2 Mathematical Modeling 

A dynamic real-time system is composed of a variety of software components that function at 
various levels of abstraction, as well as a variety of physical (hardware) components that govern 
the real-time performance of the system. 

2.1 Hardware components 

The physical components of a real-time system can be described by a set of computational
resources and network resources. The computational resources are a set of host computers $H$.
The properties of each host $h \in H$ are specified by a set of attributes. Among the more
important ones are the identifier name(h), the size of the local read only memory
mem(h), a numerical value sec(h) that specifies the current security level of h, speed factors
int_spec(h) and float_spec(h) for the integer and floating point SPEC rates respectively, and
overhead time o(h) for send and receive operations. Computational resources are off-the-shelf
general purpose machines running multitasking operating systems.

The network structure may be formalized as a directed graph $N = (H, L)$, where $L$ is the set of
physical (undirected or directed) links between host nodes. Each link $l \in L$ has a fixed bandwidth
bandwidth(l) and operates in a mode op_mode(l), which is either half duplex or full duplex. We
describe the connections between hosts by a function $link : H \times H \to L$. It is furthermore
assumed that pairs of hosts $h$ and $h'$ are connected by a fixed communication path described by a
function $route : H \times H \to P(T)$, where $P(T)$ is the set of all simple paths in $T$ describing the basic
routing information. Associated with each pair $(h_1, h_2)$ of hosts is a propagation delay
p_delay($h_1, h_2$) measured in either packets per second or bits per second. An additional queuing
delay may be considered in case of heavy communication load.

It is generally assumed that the set of resources and network topology are fixed. 
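As an illustration, the hardware model can be held in a few plain records. The following Python sketch is not part of the original formulation: the class names, the field units, and the representation of a route as a sequence of host names are assumptions made here for concreteness.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Host:
    name: str          # identifier name(h)
    mem: int           # memory size mem(h); unit (bytes) is an assumption
    sec: int           # current security level sec(h)
    int_spec: float    # integer SPEC rate factor
    float_spec: float  # floating point SPEC rate factor
    overhead: float    # send/receive overhead o(h); unit (seconds) is an assumption


@dataclass(frozen=True)
class Link:
    bandwidth: float   # bandwidth(l); unit (bits per second) is an assumption
    full_duplex: bool  # op_mode(l): True for full duplex, False for half duplex


@dataclass
class Network:
    hosts: Dict[str, Host]
    link: Dict[Tuple[str, str], Link]        # link : H x H -> L
    route: Dict[Tuple[str, str], List[str]]  # fixed route, stored as a host sequence

    def route_links(self, src: str, dst: str) -> List[Link]:
        """Links traversed by the fixed route from src to dst."""
        path = self.route[(src, dst)]
        return [self.link[(u, v)] for u, v in zip(path, path[1:])]

    def bottleneck_bandwidth(self, src: str, dst: str) -> float:
        """Smallest link bandwidth along the fixed route."""
        return min(l.bandwidth for l in self.route_links(src, dst))
```

Because the set of resources and the topology are assumed fixed, such records would be built once and only read afterwards.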

2.2 Software components 

While we assume that the hardware resources are fixed, the parameters that affect the
performance of the software components may change dynamically. Nevertheless, we assume that
the operating conditions and parameters of the software components are constant at least for
some time interval.

The software components of a dynamic real-time system can be decomposed into several
abstraction levels: the system, consisting of several sub-systems, each being a set of paths of
application software (see Figure 1).



2.2.1 The System The highest level of abstraction represents the system. A system
$S = \{SS_1, \ldots, SS_m\}$ is considered as a collection of sub-systems. There are no specific
attributes associated with a system. It simply represents the entire set of sub-systems that are
currently being executed on a single system.

2.2.2 Sub-Systems The next level of abstraction is that of sub-systems. A sub-system represents
some part of the system that can be separated semantically from the total system. A sub-system
$SS = \{P_1, \ldots, P_{m'}\}$ is simply a collection (set) of paths, along with a priority prio(SS) and a
security level sec(SS). The priority is user-defined and determines the perceived priority of the
given collection of paths. The sub-system priority and security level are inherited by the paths
and applications in the sub-system.

2.2.3 Paths The next lower level of abstraction in a real-time system is the notion of a path. A
path $P_i$ consists of a set of applications $A_i \subseteq A$ and a precedence relation $\prec_i$. The precedence
relation provides information concerning the execution order of the applications in a path, as
well as their communication characteristics. We will assume that the transitive closure of the
precedence relation $\prec_i$ defines unique largest and smallest elements in $A_i$. Different paths may
share the same application, as shown in Figure 2.



Figure 2: Example for overlapping paths 


There are two basic types of paths: periodic paths and event-driven paths. Each periodic path $P_i$
has a given period $\pi_i$. Modeling a periodic path implies that the path has to be executed exactly
once in each period. Each event-driven path $P_i$ has a maximum event rate $r_i$, which is generally
not known, and a deadline $d_i$. It is assumed that the deadlines are hard in the sense that it is not
allowed to complete a path later than the deadline. In this paper, we model event-driven paths as
periodic paths where the period is the inverse of the event rate, $\pi_i = 1/r_i$. The reason is that
choosing $1/r_i$ as the period covers the worst-case scenario: if the paths can be scheduled feasibly
with maximum event rates, then we are sure to have a feasible situation in case of smaller event
rates. In each period there is a deadline that is $d_i$ time units after the start of the period.
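The worst-case argument can be made concrete with a small sketch. The helper names below are hypothetical, and the processor demand is taken, as an assumption, to be the runtime per event multiplied by the event rate.

```python
def period_from_rate(max_event_rate: float) -> float:
    """Period 1/r of the periodic path that models an event-driven path."""
    if max_event_rate <= 0:
        raise ValueError("maximum event rate must be positive")
    return 1.0 / max_event_rate


def cpu_demand(runtime: float, event_rate: float) -> float:
    """Fraction of a processor consumed: runtime per event times events per time unit."""
    return runtime * event_rate
```

Under this assumption the demand grows with the event rate, so a schedule that is feasible at the maximum rate stays feasible at any smaller rate.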

There are two more attributes: each path $P_i$ has a dynamic workload $w(P_i)$ that is essentially
defined by the amount of input data for $P_i$, and a priority that is inherited from the sub-system $P_i$
belongs to: $prio(P_i) = prio(SS)$.

As for the notation, the paths’ workloads and maximum event rates are collected in vectors, the 
workload vector w and event rate vector r , respectively. 

2.2.4 Applications At the lowest level of abstraction, the software components of a real-time
system consist of a set of applications $A = \{a_1, \ldots, a_n\}$. Each application $a$ has some workload
$w_a$. For simplicity we assume that applications in a path inherit the workload of the path:
application $a$ of path $P_i$ has workload $w_a = w(P_i)$. Thus, overlapping paths (i.e., paths that have
common applications as in Figure 2) have equal workloads.

One of the main objectives is to find an optimal allocation of the applications to host computers.
Such an allocation, formally described by a function $host : A \to H$, has to fulfill runtime
conditions and memory limitations on the hosts. Both execution time and memory usage of an
application depend not only on its workload and service level parameters, but also on the
machine on which it is being executed.

We assume that there exists a set of $n$ global service levels [6] $S = \{s_a \mid a \in A\}$ (one for each
application), each of which may be set to an arbitrary value of $M$. This (potentially
multidimensional) parameter affects the level of service to the user, and therefore affects the
overall utility of the system. Service level setting is defined for each application separately. This
parameter also affects the running time of the application.

For each application $a \in A$, each host $h \in H$, each workload $w \in \mathbb{N}$ and each service level
$s \in M$, we define $r_{a,h}(w_a, s_a)$ as the processing time, i.e., the amount of time that a response requires
when an application $a$ is executed on host $h$ with workload $w_a$ and service level parameter $s_a$
[11]. Similarly, $m_{a,h}(w_a, s_a)$ is the amount of memory used by application $a$ in the same setting.
In addition, to avoid non-eligible or security violating allocations, we make the following
assumptions concerning $r_{a,h}$ and $m_{a,h}$:

(i) $r_{a,h}(w_a, s_a) = \infty$ and $m_{a,h}(w_a, s_a) = \infty$ if application $a$ cannot be executed on $h$. This may occur
if $h$ is not an eligible host for $a$, or if there would be a security violation if $a$ were to be
executed on $h$.

(ii) Both $r_{a,h}(w_a, s_a)$ and $m_{a,h}(w_a, s_a)$ are assumed to be monotonically non-decreasing in $w_a$ and
$s_a$, i.e.,

if $w_a \le w_a'$ and $s_a \le s_a'$ (component-wise),

then $r_{a,h}(w_a, s_a) \le r_{a,h}(w_a', s_a')$ and $m_{a,h}(w_a, s_a) \le m_{a,h}(w_a', s_a')$.

If applications of a path are allocated to different hosts, data transmission between the hosts will
be required. If $a \prec_i a'$, it is assumed that application $a$ and application $a'$ communicate via
interprocess communication in the local area network. The amount of communication in a path
depends on the workload of the path. Given a workload $w_a$ and a setting of the service level $s_a$,
application $a$ sends $c_{a,a'}(w_a, s_a)$ bits of information to $a'$. We assume that $c_{a,a'}(w_a, s_a)$ is a
monotonically non-decreasing function of the workload of $a$.

For each $a \in P_i$, a priority may be associated by defining $p_a := prio(SS)$, where $SS$ is the uniquely
defined sub-system that holds a path with application $a$. Priorities are useful to achieve certain
overall system objectives.

3 The Resource Manager 

The resource manager (RM) is responsible for the correct operation of the whole system. As
input, it is given the static characteristics of both the hardware system and the software systems.
The resource manager cannot modify these properties. However, the resource manager is
responsible for making all resource allocation decisions and has the ability to modify certain
performance parameters such as service levels. In this section we consider the constraints that
must be satisfied and the optimizations that a resource manager can perform.

3.1 Global Allocation Constraints 

In all situations the resource manager must provide an allocation that meets the constraints of the 
system. The proposed framework supports three constraints. First, the resource manager must 
ensure that each application is assigned to a valid host, one that is capable of executing the 
application. Second, the security level of each application should not be larger than the security 
level of both the assigned host and any communication links used in the current path. Third, the 
amount of time needed by any path to complete execution must not exceed the required deadline. 
The minimum responsibility of the resource manager is to choose an allocation of applications to 
hosts such that these three constraints are satisfied at a given setting for service levels, 
workloads, and arrival rates. A feasible solution is the specification of a function $host : A \to H$
that satisfies all the allocation constraints.
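The three constraints can be written compactly in the notation of Section 2. The block below is a sketch rather than a definition taken from the model: $sec(a)$ denotes the security level that application $a$ inherits from its sub-system, and $\lambda(P_i)$ denotes the end-to-end latency of path $P_i$; both symbols are notational assumptions. The link part of the security constraint is not written out.

```latex
% Sketch of the allocation constraints for a candidate function host : A -> H.
% sec(a) and lambda(P_i) are notational assumptions, not symbols from the model.
\begin{align*}
\text{(1) valid host:} \quad & r_{a,\,host(a)}(w_a, s_a) < \infty && \text{for all } a \in A \\
\text{(2) security:}   \quad & sec(a) \le sec(host(a))          && \text{for all } a \in A \\
\text{(3) deadline:}   \quad & \lambda(P_i) \le d_i              && \text{for all paths } P_i
\end{align*}
```

Constraint (1) relies on the convention that the processing time is infinite on a host that cannot execute the application.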

3.2 Global Allocation Optimizations 

In addition to constraint-satisfaction, a resource manager should have the ability to perform 
various allocation optimizations. The objective is to find an allocation and setting of unknown 
performance parameter values such that all applications can be scheduled feasibly and the overall 
utility is maximized. The proposed model supports three performance parameters: maximum 
workload, maximum event rate, and service level. The workload and maximum event rate of an 
application are generally unknown. For this reason, the resource manager attempts to maximize 
the arrival rate and workload that can be handled by a particular allocation according to some 
heuristic. In addition the service level of an application is a knob that the resource manager can 
use to adjust both the resource usage and the overall utility. 

The overall utility of a system can be determined from the maximum workloads, maximum event
rates and service levels that are computed by the optimization heuristic. We formalize the overall
utility as a function $U(S) = U(\vec{s}, \vec{w}, \vec{r})$. Depending on the given characteristics of the system,
there are many ways to specify such a function. An example system requiring fair distribution of
resources is Dynbench [need reference], a shipboard missile detection and guidance system. The
following product utility functions are able to handle such scenarios:

$$U_1(S) = U(\vec{s}, \vec{w}, \vec{r}) = U(\vec{s}) \cdot \min_{a \in A}\{w_a\} \cdot \min_{a \in A}\{r_a\}$$

$$U_2(S) = U(\vec{s}, \vec{w}, \vec{r}) = U(\vec{s}) \cdot \min_{a \in A}\left\{\frac{w_a}{p_a}\right\} \cdot \min_{a \in A}\left\{\frac{r_a}{p_a}\right\}$$

These functions can be used to prevent the resource starvation of lower priority applications. A
weighted sum utility function does not prevent resource starvation, but allows higher priority
applications to obtain as many resources as needed for critical operation. These are example
weighted sum utility functions:




[Equations for the weighted sum utility functions]
Some systems may require more complex utility functions. For example, we could combine the
functions defined above in the following way: 

$$U_\alpha(S) = \alpha \cdot U_2(S) + (1 - \alpha) \cdot U_4(S),$$

where $\alpha \in [0, 1]$ can be used as a control parameter to mix the strategies explained above.
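The difference between the two families of utility functions, and the effect of the mixing parameter, can be illustrated with a short Python sketch. The product form follows the first product utility function above. The weighted sum form is purely hypothetical and is included only to show the contrast. All function names are assumptions.

```python
from typing import Dict


def product_utility(u_s: float, w: Dict[str, float], r: Dict[str, float]) -> float:
    """Product form: U(s) times the minimum workload times the minimum event rate."""
    return u_s * min(w.values()) * min(r.values())


def weighted_sum_utility(u_s: float, w: Dict[str, float], r: Dict[str, float],
                         prio: Dict[str, float]) -> float:
    """Hypothetical weighted sum (an assumption, not a formula from the paper)."""
    return u_s * sum(prio[a] * (w[a] + r[a]) for a in w)


def mixed_utility(alpha: float, u_product: float, u_sum: float) -> float:
    """Convex combination controlled by alpha in [0, 1]."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * u_product + (1.0 - alpha) * u_sum
```

With a starved application (workload 0), the product form drops to 0 while the hypothetical weighted sum stays positive, which is the behavior the text attributes to each family.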

The considerations and algorithmic approach presented in the remainder of this paper require
monotonicity as an important property of the overall optimization function $U(S) = U(\vec{s}, \vec{w}, \vec{r})$,
which can be described as follows:

$$\vec{s} \le \vec{s}' \Rightarrow U(\vec{s}) \le U(\vec{s}') \quad [\vec{s} \le \vec{s}' \text{ means component-wise}],$$

$$\vec{w} \le \vec{w}' \Rightarrow \min(\vec{w}) \le \min(\vec{w}') \quad \left[\text{or } \min_{a \in A}\left\{\frac{w_a}{p_a}\right\} \le \min_{a \in A}\left\{\frac{w_a'}{p_a}\right\}\right],$$

$$\vec{r} \le \vec{r}' \Rightarrow \min(\vec{r}) \le \min(\vec{r}') \quad \left[\text{or } \min_{a \in A}\left\{\frac{r_a}{p_a}\right\} \le \min_{a \in A}\left\{\frac{r_a'}{p_a}\right\}\right].$$

3.3 Considerations on Constraints and Optimizations

The following considerations are helpful in understanding our algorithmic approach defined in
the next section. Assume for simplicity that instead of $\vec{s}$, $\vec{w}$, $\vec{r}$ there are only two system
parameters, $p_1$ and $p_2$. Each may attain integer values $\ge 0$. So the question is for which pairs
$(p_1, p_2)$ the system behaves correctly and utility $U$ has a maximum. From the monotonicity
assumption we conclude that we need only to look for pairs $(p_1, p_2)$ that are maximal: $(p_1, p_2)$ is
maximal if each pair $(p_1', p_2') \ne (p_1, p_2)$ with $p_1' \ge p_1$ and $p_2' \ge p_2$ does not allow a feasible
solution. Feasibility is checked by means of a heuristic algorithm, such as threshold accepting or
simulated annealing, by directly finding an allocation of applications to hosts.

Maximal pairs can be determined by a systematic search: First one would find upper limits
separately for $p_1$ and $p_2$, while keeping the other value at minimum. Let $p_1^{max}$ and $p_2^{max}$ be the
respective maximum values. This can be done by a doubling strategy, by starting with 1 for $p_1$
resp. $p_2$. With known values $p_1^{max}$ and $p_2^{max}$, an off-line algorithm could determine maximum
pairs $(p_1, p_2)$ with $0 \le p_1 \le p_1^{max}$ and $0 \le p_2 \le p_2^{max}$. Since we assume non-negative integer
parameters, the number of pairs to check is limited by $(p_1^{max} + 1)(p_2^{max} + 1)$. Figure 3 illustrates
the maximum parameter pairs (black dots). The pairs lying below and left of each maximum
parameter pair allow feasible solutions.

The generalization to the general parameter set s, w, r is straightforward. Knowing the
parameter area with feasible solutions is useful for on-line algorithms; if the running system
requires certain parameter settings, the feasibility of the settings can be checked easily.
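The two-parameter search can be sketched in Python as follows. The feasibility oracle is passed in as a function and is assumed to be monotone, as required by the monotonicity property. The function names and the `cap` safeguard are assumptions. The second function is a variation that exploits monotonicity to walk the boundary of the feasible region instead of checking every pair.

```python
from typing import Callable, List, Tuple


def upper_limit(feasible_1d: Callable[[int], bool], cap: int = 1 << 20) -> int:
    """Largest value p in [0, cap] with feasible_1d(p) true.

    Doubling finds an infeasible bound; bisection then closes the gap.
    Assumes feasible_1d is monotone: once false, it stays false.
    """
    if not feasible_1d(0):
        raise ValueError("the minimum setting is already infeasible")
    lo, hi = 0, 1
    while hi <= cap and feasible_1d(hi):
        lo, hi = hi, hi * 2
    hi = min(hi, cap + 1)  # values above cap are treated as infeasible
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if feasible_1d(mid):
            lo = mid
        else:
            hi = mid
    return lo


def maximal_pairs(feasible: Callable[[int, int], bool],
                  p1_max: int, p2_max: int) -> List[Tuple[int, int]]:
    """Maximal feasible pairs, found by walking down the feasibility boundary."""
    pairs: List[Tuple[int, int]] = []
    p2 = p2_max
    for p1 in range(p1_max + 1):
        while p2 >= 0 and not feasible(p1, p2):
            p2 -= 1
        if p2 < 0:
            break
        if pairs and pairs[-1][1] == p2:
            pairs[-1] = (p1, p2)  # same height: the earlier pair is dominated
        else:
            pairs.append((p1, p2))
    return pairs
```

For a monotone oracle the walk needs at most $p_1^{max} + p_2^{max} + 2$ feasibility checks, which matters when each check runs a heuristic allocation search.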

4 A Framework for Allocation Algorithms 

In this section a framework for allocation algorithms is presented with the objective to maximize
overall utility. The utility of an allocation is a function of the service levels, calculated maximum 
workloads and calculated maximum event rates. The utility function does not depend on the 
particular structure of a solution, but assumes that the schedule is feasible. 

Before discussing allocation algorithms, we must explore the differences between off-line and 
on-line algorithms. Off-line algorithms are performed before a system has been started, and thus 
are not limited by tight time constraints. For this reason, these algorithms may be brute force 
algorithms that are capable of finding optimal allocations and performance parameter settings. 
On-line algorithms on the other hand are executed simultaneously with the dynamic systems for 
which they are responsible for allocating resources. These types of algorithms operate under 
strict timing constraints and are typically used for making fast, intelligent reallocation decisions. 

The framework proposed in this section is decomposed into several modules. An off-line 
algorithm could take advantage of all the functionality provided by these modules. In contrast an 
on-line algorithm may require the use of only a subset of the modules presented due to strict 
timing requirements. For this reason, we will look at each of the modules in the context of an off- 
line algorithm. The structure of such an algorithm is shown in Figure 4. 
Figure 4: Structure and Modules of an Off-line Allocation Algorithm.


Initially, the define_interface module uses the hardware and software specifications to determine
the initial settings for the triplet (s, w, r) and the corresponding overall utility U(s, w, r). The
initial_allocation module constructs an allocation of applications to hosts subject to the
conditions of the triplet (s, w, r). The feasibility_test module determines whether the allocation
is feasible. If the allocation is feasible and stopping_criterion1 has not been satisfied, then the
modify_parameters(+) module increases the performance parameter settings, resulting in a new
setting for (s, w, r) and thus increasing the overall utility. However, if the allocation found was not
feasible and stopping_criterion2 has not been satisfied, then the optimize_allocation module
modifies the allocation subject to the triplet (s, w, r) by using optimization procedures such as
general local search procedures and greedy heuristics. This step continues until a feasible
allocation is found or stopping_criterion2 is satisfied. If stopping_criterion2 is satisfied and the
allocation is still not feasible, then the modify_parameters(-) module decreases the performance
parameter settings, resulting in a new setting for (s, w, r) and causing the overall utility to decrease.
After either modify_parameters(+) or modify_parameters(-) has been executed, the algorithm
reenters the initial_allocation module and the process continues. We will now look at each of
these modules in more detail.
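The control flow described above can be summarized in a short Python skeleton. It is a sketch under stated assumptions: the modules are passed in as plain callables, and the two stopping criteria, which are not specified further here, are replaced by simple iteration budgets (`max_rounds` and `max_repairs`).

```python
from typing import Any, Callable, Optional, Tuple


def offline_allocate(
    params: Any,
    initial_allocation: Callable[[Any], Any],
    feasibility_test: Callable[[Any, Any], bool],
    optimize_allocation: Callable[[Any, Any], Any],
    increase: Callable[[Any], Any],   # plays the role of modify_parameters(+)
    decrease: Callable[[Any], Any],   # plays the role of modify_parameters(-)
    utility: Callable[[Any], float],
    max_rounds: int = 100,            # assumed stand-in for stopping_criterion1
    max_repairs: int = 20,            # assumed stand-in for stopping_criterion2
) -> Optional[Tuple[float, Any, Any]]:
    """Return (utility, params, allocation) of the best feasible setting seen."""
    best = None
    for _ in range(max_rounds):
        alloc = initial_allocation(params)
        feasible = feasibility_test(alloc, params)
        repairs = 0
        # Repair loop: modify the allocation while the parameters stay fixed.
        while not feasible and repairs < max_repairs:
            alloc = optimize_allocation(alloc, params)
            feasible = feasibility_test(alloc, params)
            repairs += 1
        if feasible:
            u = utility(params)
            if best is None or u > best[0]:
                best = (u, params, alloc)
            params = increase(params)   # try for a higher utility
        else:
            params = decrease(params)   # back off to lower resource needs
    return best
```

Keeping the best feasible setting seen so far means the skeleton still returns a usable result if the search ends while oscillating around the feasibility boundary.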

4.1 Module define_interface

The module define_interface provides interfaces between the resource manager and the
allocation algorithm and provides the data structures to store the needed information for the
operation of the allocation algorithm. The resource manager provides the module with the static
characteristics of both the hardware and software systems as described in Section 2. The module
uses this information to produce initial settings for the unknown performance parameters and the
service level of each application. These initial settings are represented by the triplet (s, w, r).

The module returns this triplet and the corresponding initial overall utility U(s, w, r). Figure 5
provides more detail about this module. For example, the latency function in Figure 5 represents
the actual amount of time the task will take to complete processing due to the resource needs of
the application, the resource characteristics, and the contention for the needed resources. We do
not provide a complete listing of all the functions that should appear in this module, but include
some of the more essential and understandable functions.


module define_interface

  o defines initial settings for service levels, workloads, event rates,
    and returns a triplet $(\vec{s}, \vec{w}, \vec{r})$
    for example: initial service level setting $\vec{s} = (1, \ldots, 1)$
                 initial workloads $\vec{w} = (1, \ldots, 1)$
                 initial event rates $\vec{r} = (0, \ldots, 0)$

  o provides modules for computing
      runtime $r_{a,h}(w_a, s_a)$ of application a
      latency $\lambda_{a,h}(w_a, s_a)$ of application a
      memory requirement $m_{a,h}(w_a, s_a)$ of application a
      communication time $c_{a,a'}(w_a, s_a)$ for applications $a \prec a'$
      system benefit $U(\vec{s}, \vec{w}, \vec{r})$


Figure 5: Module define_interface.


4.2 Module initial_allocation 

The module initial_allocation constructs an allocation of applications to hosts such that their
runtimes are minimized. However, runtime minimization cannot be expected to be fully achieved
due to limited processor power and memory. It is important to realize that the minimization of
runtimes is not the overall objective of an allocation algorithm, but a mechanism for producing a
reasonable initial allocation. Figure 6 contains the heuristic used by this module. The allocation
is represented as a function $host : A \to H$, and the heuristic strategy follows a two-dimensional
bin-packing approach.

module initial_allocation

Input: parameters $\vec{s}$, $\vec{w}$, $\vec{r}$
Output: function $host : A \to H$
implementation
procedure host;
    initialize cpu_available(h) := 0.7; mem_available(h) := mem(h);
    for each path p do
        for each application a on p do
            assign a to host h such that $\lambda_{a,h}(w_a, s_a)$ is minimum
                subject to $r_{a,h}(w_a, s_a) / \pi_p \le$ cpu_available(h)
                and $m_{a,h}(w_a, s_a) \le$ mem_available(h);
            -- latency $\lambda_{a,h}(w_a, s_a) := r_{a,h}(w_a, s_a)$ + queuing delay
            reduce cpu_available(h) by $r_{a,h}(w_a, s_a) / \pi_p$;
            reduce mem_available(h) by $m_{a,h}(w_a, s_a)$;


Figure 6: Module initial_allocation. 
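A runnable Python sketch of the heuristic in Figure 6 is given below. It makes several assumptions that are not stated in the figure: the queuing delay is ignored, so the runtime stands in for the latency; an application shared by overlapping paths is placed only once; and the function returns `None` when some application fits on no host. The parameter names are hypothetical.

```python
from typing import Callable, Dict, List, Optional


def initial_allocation(
    paths: Dict[str, List[str]],           # path name -> applications on the path
    period: Dict[str, float],              # path name -> period
    host_mem: Dict[str, float],            # host name -> mem(h)
    runtime: Callable[[str, str], float],  # (a, h) -> runtime, inf if not eligible
    memory: Callable[[str, str], float],   # (a, h) -> memory requirement
    cpu_threshold: float = 0.7,            # initial cpu_available(h) as in Figure 6
) -> Optional[Dict[str, str]]:
    cpu_available = {h: cpu_threshold for h in host_mem}
    mem_available = dict(host_mem)
    host: Dict[str, str] = {}
    for p, apps in paths.items():
        for a in apps:
            if a in host:
                continue  # already placed via an overlapping path (assumption)
            candidates = [
                h for h in host_mem
                if runtime(a, h) / period[p] <= cpu_available[h]
                and memory(a, h) <= mem_available[h]
            ]
            if not candidates:
                return None  # no host can take a under the current budgets
            # Greedy choice: smallest runtime (latency proxy, queuing ignored).
            h = min(candidates, key=lambda c: runtime(a, c))
            host[a] = h
            cpu_available[h] -= runtime(a, h) / period[p]
            mem_available[h] -= memory(a, h)
    return host
```

An infinite runtime excludes a host automatically, because the utilization test then fails for any finite budget.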
Module feasibility_test

The module feasibility_test implements a test to analyze the feasibility of a solution. A feasible
solution is a function $host : A \to H$ that satisfies all the allocation constraints identified in Section
3.1. The test requires invoking functions provided by the define_interface module and using the
returned estimations to determine the feasibility of the allocation. If the feasibility test fails, then
the allocation must be modified. If stopping_criterion2 is not true, then the optimize_allocation
module is invoked to move applications to different hosts. If stopping_criterion2 is true, then the
parameters (s, w, r) are changed such that the overall utility is decreased. Since we assume
monotonicity, decreasing these parameters results in lower resource needs. Once a feasible
solution is found, the parameters (s, w, r) are changed such that overall utility is increased
unless stopping_criterion1 is true.

At this level we have to deal with resource contention for all resources. In our proposed model
we have considered the processor, memory, and network links. For the processor and memory,
the feasibility test must determine whether the utilization thresholds are violated. Since contention
is present, the latency of a path, defined as the time to complete processing, must be less than
the required deadline minus the start time. Contention is encountered in both the processor and
the network link. For the processor on a time-shared operating system like UNIX, direct analysis
of the response time due to dynamic priority round-robin scheduling can be performed to
determine the latency of a single application. For communication delays, each pair of dependent
applications $a \prec a'$ on different hosts gives rise to a communication task $c_{a,a'}$. The size of $c_{a,a'}$ is
specified as the number of output bits or packets produced by application $a$. The latency of
transmission depends on the technical network properties and the queuing delays due to the
current network traffic. The latency of a path is defined as the summation of the latencies of all
the applications and communication tasks belonging to the path.
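The summation that defines the path latency can be sketched in Python as follows. The sketch assumes that the precedence relation orders the applications of a path as a simple chain, that the transmission latency is the message size divided by the bandwidth plus the propagation delay, and that network queuing delays are ignored. The function and parameter names are hypothetical.

```python
from typing import Callable, Dict, Sequence


def path_latency(
    apps: Sequence[str],                     # applications in precedence order
    host: Dict[str, str],                    # allocation: application -> host
    app_latency: Callable[[str, str], float],
    comm_bits: Callable[[str, str], float],  # size of the communication task
    bandwidth: Callable[[str, str], float],  # bits per time unit between hosts
    p_delay: Callable[[str, str], float],    # propagation delay between hosts
) -> float:
    """Sum of application latencies and inter-host communication latencies."""
    total = 0.0
    for i, a in enumerate(apps):
        total += app_latency(a, host[a])
        if i + 1 < len(apps):
            nxt = apps[i + 1]
            src, dst = host[a], host[nxt]
            if src != dst:  # a communication task arises only across hosts
                total += comm_bits(a, nxt) / bandwidth(src, dst) + p_delay(src, dst)
    return total


def meets_deadline(latency: float, deadline: float, start_time: float = 0.0) -> bool:
    """Deadline test of the feasibility module (boundary case treated as feasible)."""
    return latency <= deadline - start_time
```

Placing two communicating applications on the same host removes the communication term, which is one reason a move between hosts can turn an infeasible allocation into a feasible one.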

4.3 Module optimize_allocation

The module optimize_allocation is entered when an allocation has failed to pass the feasibility
test and stopping_criterion2 is not satisfied. This module implements functions for modifying the
allocation under the conditions of the given parameter settings s, w and r. This is done by
creating a neighborhood of allocations. For defining a neighborhood allocation we provide the
operator defined below:

move(host, a, h) = host'

The operator requires the current solution host, an application a, and a target host h as parameters
and returns a new solution host' that is equal to host except for moving application a to host h if
possible. The operator results in the assignment of application a to host h. For a given allocation
function $host : A \to H$, the neighborhood N(host) can be defined as the set of allocation functions

$N(host) = \{ move(host, a, h) \mid a \in A, h \in H \}$.

Other neighborhoods might be necessary to further improve the efficiency and performance of 
the optimization technique. The neighborhood functions are the basis for heuristic optimization 
algorithms to improve the allocation for given parameter settings. General purpose local search 
optimization heuristics such as simulated annealing, tabu search, and evolutionary algorithms 
can be implemented as swappable components within this module. 
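The move operator and the neighborhood it induces can be sketched in Python as follows. An allocation is represented as a dictionary from applications to hosts. As a simplification, moves that leave an application on its current host are left out of the neighborhood, and eligibility is not checked by the operator itself; it can be expressed through an infinite cost. The `improve` step is one possible local search component, not a prescribed one.

```python
from typing import Callable, Dict, Iterable, List

Allocation = Dict[str, str]  # application -> host


def move(host: Allocation, a: str, h: str) -> Allocation:
    """Return a copy of the allocation with application a reassigned to host h."""
    new_host = dict(host)
    new_host[a] = h
    return new_host


def neighborhood(host: Allocation, hosts: Iterable[str]) -> List[Allocation]:
    """All allocations reachable by moving exactly one application."""
    targets = list(hosts)
    return [move(host, a, h) for a in host for h in targets if h != host[a]]


def improve(host: Allocation, hosts: Iterable[str],
            cost: Callable[[Allocation], float]) -> Allocation:
    """One best-improvement step: take the cheapest neighbor if it beats the current cost."""
    best = min(neighborhood(host, hosts), key=cost, default=host)
    return best if cost(best) < cost(host) else host
```

Because `move` copies the allocation, the current solution stays intact while its neighbors are evaluated, which is what simulated annealing or tabu search would need when they reject a candidate.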
Module modify_parameters


The module modify_parameters is responsible for modifying the performance parameters s, w
and r. When notified that a feasible solution exists for the current parameter settings (s, w, r),
this module will find new parameter settings (s', w', r') that result in a higher system utility. We
write this condition as:


$$U(\vec{s}', \vec{w}', \vec{r}') > U(\vec{s}, \vec{w}, \vec{r})$$


Due to the monotonicity assumption made in Section 3.2, such a selection of parameters results
in higher resource requirements. The algorithm must attempt to find a new allocation subject to
new parameter settings.

The module modify_parameters may also be notified when a feasible solution cannot be found
for the current parameter settings (s, w, r). The module proceeds to find new parameter settings
(s', w', r') that result in a lower system utility. We write this condition as:


$$U(\vec{s}', \vec{w}', \vec{r}') < U(\vec{s}, \vec{w}, \vec{r})$$


This selection of parameters results in lower resource requirements due to the monotonicity
assumption. This implies that a feasible solution may exist for the new parameters.

5 Related Research 

The framework we have presented has been influenced by many growing fields of research in the 
real-time community. In particular, we have designed our model to allow dynamic resource 
allocations, permit dynamic profiling, incorporate utility models, and utilize application service 
levels. In this section, we discuss some of the research that most influenced our model. 

DQM [2] uses QoS levels (service levels in our model) to adapt multimedia applications to
overload situations. The use of QoS levels enables DQM to degrade gracefully in overload
situations. However, DQM uses a worst-case execution time, as in [Liu, WCET], to determine
application resource usage. It does not reallocate tasks at run-time, only considers one resource,
and does not guarantee that the optimal, or even a near-optimal, set of choices has been made for
every situation.

Q-RAM [14] uses a utility function approach to dynamically determine what service levels to 
choose for a group of applications. Utility can be nearly optimized at run-time by dynamically 
allocating multiple finite resources to satisfy multiple service levels. A drawback to the model is 
the use of profiles determined a priori that are associated with each service level. In [5], a similar 
problem is addressed, but the notion of utility is simpler. The same drawback is present in [5] as 
in [14]. 

In QuO [18], applications adjust their own service levels to improve performance and adjust to 
their environment. The model has many drawbacks for dynamic environments. It does not treat 
all resources within the system as a single set of resources, so reallocations do not occur. 
Applications react to the environment of their own accord, so there is no way to optimize the set of
choices made for all applications.

Burns et al. [3] explain the need for utility-based scheduling in dynamic, real-
time environments. Their model includes a set of different service levels, alternatives, for tasks.
They also present a method for eliciting utility preferences. However, they characterize
resource usage by worst-case execution time and they do not take many dynamic measures into
account, such as workload and event arrival rate.

In [7, 8, 9], Kalogeraki et al. use dynamic object profiling techniques to determine resource
usage, and resource reallocation techniques are implemented as cooling and heating algorithms
to ensure load balancing. A utility function is used to determine what applications to replicate for
fault tolerance. Application relations are defined by a graph and referred to as a task. The
approach does not include much in the way of utility optimization for resource allocations,
except for fault tolerance, and does not include service levels of any type.

In other works, we were mostly concerned with notions of service levels. Liu et al. [17] use a
notion of service levels where tasks are defined by a mandatory task and an optional task. The
optional task’s operation may be cut off at any time to get an output and save resources for other
tasks. The optional task’s utility increases with time, until it reaches a maximum. In the Elastic
Scheduling technique [4], applications are modeled as springs with associated elastic
coefficients. The service level for an application is lowered by compressing the application, and
the service level is raised by allowing the application to expand. The FLEX language [10] allows
programmers to define performance polymorphism to allow a set of alternate algorithms to be
executed for one function.


6 Conclusions 

In this paper, we have presented a model that characterizes distributed real-time systems 
operating in dynamic environments. Distributed resources are treated as a pool of resources to be 
used by the real-time system as a whole. Dynamic environment characteristics are modeled by 
event arrival rates, workloads, and service levels. The notions of utility and service levels 
provide a means for graceful degradation and give a manner to optimize the allocation of 
resources. A framework is presented to produce feasible, optimal allocations even when 
applications receive unknown event arrival rates and process dynamic amounts of workload. 

Future work includes producing a more practical service level parameter definition, integrating 
fault tolerance into the utility functions as done in [9], and allowing for load sharing techniques 
among replicas. 